Overview
The EDL Pipeline integrates directly with NSE India’s archive endpoints to fetch official price band data and equity lists. These endpoints serve CSV files that are parsed and converted to JSON format.
NSE endpoints do not require authentication but need proper browser-like headers to avoid blocking.
Price Bands Endpoints
Incremental Price Band Changes
Endpoint: https://nsearchives.nseindia.com/content/equities/eq_band_changes_{date}.csv
Script: fetch_incremental_price_bands.py
Output: incremental_price_bands.json
Fetches daily price band changes (revisions) for securities. The endpoint URL includes a date parameter in ddmmyyyy format.
URL Format:
https://nsearchives.nseindia.com/content/equities/eq_band_changes_15032024.csv
Required Headers:
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
"Accept": "*/*"
}
CSV Data Format:
The CSV contains columns like:
SYMBOL - Stock symbol
SERIES - Security series (EQ, BE, etc.)
NEW_BAND - New price band limit
OLD_BAND - Previous price band limit
DATE - Effective date
Fetch Strategy:
The script checks for files going backwards from today up to 7 days:
for i in range(8):
    check_date = today - timedelta(days=i)
    date_str = check_date.strftime("%d%m%Y")  # Format: 15032024
    url = base_url.format(date=date_str)
Response Handling:
- 200 OK: CSV file found and parsed
- 404 Not Found: No file for that date (NSE only publishes on days with changes)
- Other codes: Logged and skipped
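The fetch strategy and response handling above can be combined into one end-to-end sketch. The URL, headers, lookback window, and status handling are taken from this page; the function names are illustrative:

```python
from datetime import datetime, timedelta
import io

import pandas as pd
import requests

BASE_URL = "https://nsearchives.nseindia.com/content/equities/eq_band_changes_{date}.csv"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Accept": "*/*",
}

def candidate_urls(today=None, lookback=7):
    """Yield (date_str, url) pairs for today and the previous `lookback` days."""
    today = today or datetime.now()
    for i in range(lookback + 1):
        date_str = (today - timedelta(days=i)).strftime("%d%m%Y")
        yield date_str, BASE_URL.format(date=date_str)

def fetch_latest_band_changes():
    """Return records from the most recent available file, or None."""
    for date_str, url in candidate_urls():
        response = requests.get(url, headers=HEADERS, timeout=10)
        if response.status_code == 200:
            df = pd.read_csv(io.StringIO(response.content.decode("utf-8")))
            return df.to_dict(orient="records")
        if response.status_code == 404:
            continue  # No file for that date; NSE publishes only on revision days
        print(f"Unexpected status {response.status_code} for {date_str}; skipping.")
    return None
```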
Example JSON Output:
[
{
"SYMBOL": "RELIANCE",
"SERIES": "EQ",
"NEW_BAND": "20",
"OLD_BAND": "10",
"DATE": "15-MAR-2024"
},
{
"SYMBOL": "TCS",
"SERIES": "EQ",
"NEW_BAND": "20",
"OLD_BAND": "20",
"DATE": "15-MAR-2024"
}
]
NSE only publishes price band change files on days when actual revisions occur. The script searches backwards up to 7 days to find the latest file.
cURL Example:
curl -X GET "https://nsearchives.nseindia.com/content/equities/eq_band_changes_15032024.csv" \
-H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
-H "Accept: */*"
Complete Price Bands (All Securities)
Endpoint: https://nsearchives.nseindia.com/content/equities/sec_list_{date}.csv
Script: fetch_complete_price_bands.py
Output: complete_price_bands.json
Fetches the complete list of all securities with their current price bands. This is a larger file (3,000+ securities), published daily.
URL Format:
https://nsearchives.nseindia.com/content/equities/sec_list_15032024.csv
Required Headers:
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
"Accept": "*/*"
}
CSV Data Format:
Columns include:
SYMBOL - Stock symbol
NAME OF COMPANY - Full company name
SERIES - Security series (EQ, BE, BZ, etc.)
DATE OF LISTING - Listing date
PAID UP VALUE - Face value
MARKET LOT - Trading lot size
ISIN NUMBER - ISIN identifier
FACE VALUE - Face value per share
Example JSON Output:
[
{
"SYMBOL": "RELIANCE",
"NAME OF COMPANY": "Reliance Industries Limited",
"SERIES": "EQ",
"DATE OF LISTING": "29-NOV-1977",
"PAID UP VALUE": "10",
"MARKET LOT": "1",
"ISIN NUMBER": "INE002A01018",
"FACE VALUE": "10"
},
{
"SYMBOL": "TCS",
"NAME OF COMPANY": "Tata Consultancy Services Limited",
"SERIES": "EQ",
"DATE OF LISTING": "25-AUG-2004",
"PAID UP VALUE": "1",
"MARKET LOT": "1",
"ISIN NUMBER": "INE467B01029",
"FACE VALUE": "1"
}
]
Fetch Strategy:
As with the incremental bands, the script searches backwards up to 7 days:
for i in range(8):
    check_date = today - timedelta(days=i)
    date_str = check_date.strftime("%d%m%Y")
    url = base_url.format(date=date_str)
This file is typically 2-3 MB in size. Allow adequate timeout (15 seconds) for download and parsing.
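A minimal download sketch using the longer timeout; the URL format is the one documented above, while the helper names are illustrative:

```python
import requests

SEC_LIST_URL = "https://nsearchives.nseindia.com/content/equities/sec_list_{date}.csv"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Accept": "*/*",
}

def sec_list_url(date_str):
    """Build the archive URL for a given ddmmyyyy date string."""
    return SEC_LIST_URL.format(date=date_str)

def download_sec_list(date_str, timeout=15):
    """Download the full security list; raises on 404 so callers can try earlier dates."""
    response = requests.get(sec_list_url(date_str), headers=HEADERS, timeout=timeout)
    response.raise_for_status()
    return response.content.decode("utf-8")
```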
cURL Example:
curl -X GET "https://nsearchives.nseindia.com/content/equities/sec_list_15032024.csv" \
-H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
-H "Accept: */*" \
--output sec_list.csv
Data Processing
CSV Parsing Strategy
Both endpoints return CSV data that requires careful parsing:
import pandas as pd
import io
# Decode CSV content
csv_content = response.content.decode('utf-8')
# Use pandas for robust parsing
df = pd.read_csv(io.StringIO(csv_content))
# Convert to list of dictionaries
raw_data = df.to_dict(orient='records')
# Clean whitespace from keys and values
clean_data = []
for record in raw_data:
    cleaned_record = {}
    for k, v in record.items():
        key = k.strip() if isinstance(k, str) else k
        value = v.strip() if isinstance(v, str) else v
        cleaned_record[key] = value
    clean_data.append(cleaned_record)
The pipeline uses pandas for robust CSV parsing as NSE files sometimes contain irregular whitespace or encoding issues.
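Once cleaned, the records are written to the JSON output file named earlier. A minimal sketch (the helper name is illustrative):

```python
import json

def write_records(records, path="incremental_price_bands.json"):
    """Persist cleaned records as pretty-printed JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)
```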
Whitespace Cleaning
NSE CSV files often have trailing/leading whitespace in column names and values. The pipeline cleans this:
# Clean column names
key = k.strip() if isinstance(k, str) else k
# Clean values
value = v.strip() if isinstance(v, str) else v
Date Formatting
NSE uses ddmmyyyy format (e.g., 15032024 for March 15, 2024):
from datetime import datetime, timedelta
today = datetime.now()
date_str = today.strftime("%d%m%Y")  # e.g. 15032024
Error Handling
404 Not Found
404 errors are common and expected:
if response.status_code == 404:
    print(f"No file found for {date_str} (404).")
    # Continue checking previous days
NSE doesn’t publish files on weekends and holidays. The 7-day lookback ensures we find the latest available file.
Parsing Errors
Handle CSV parsing failures gracefully:
try:
    df = pd.read_csv(io.StringIO(csv_content))
except Exception as parse_error:
    print(f"Error parsing CSV for {date_str}: {parse_error}")
    continue  # Try next date
Network Timeouts
Use appropriate timeouts to avoid hanging:
response = requests.get(url, headers=headers, timeout=10) # Incremental
response = requests.get(url, headers=headers, timeout=15) # Complete list
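Slow responses raise requests.exceptions.Timeout, which covers both connect and read timeouts. A hedged wrapper might look like this (the function name and getter parameter are illustrative):

```python
import requests

def fetch_with_timeout(url, headers, timeout, getter=requests.get):
    """Return the response, or None if the request times out."""
    try:
        return getter(url, headers=headers, timeout=timeout)
    except requests.exceptions.Timeout:
        print(f"Timed out after {timeout}s: {url}")
        return None
```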
Surveillance Lists
ASM/GSM Lists
Script: fetch_surveillance_lists.py
Source: Google Sheets Gviz endpoint
Output: nse_asm_list.json, nse_gsm_list.json
ASM (Additional Surveillance Measure) and GSM (Graded Surveillance Measure) lists are fetched from Google Sheets maintained by NSE, with a fallback to Dhan’s Next.js API.
List Types:
- ASM: Stocks under additional surveillance (Short-term ASM, Long-term ASM)
- GSM: Stocks under graded surveillance (Stage 1-4)
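Gviz endpoints return their JSON payload wrapped in a google.visualization.Query.setResponse(...) call, which must be stripped before parsing. A sketch of the unwrapping step (the sheet URL itself is not shown on this page, so only the parsing is illustrated):

```python
import json
import re

def parse_gviz(body):
    """Extract the JSON payload from a Google Sheets gviz response."""
    match = re.search(r"setResponse\((.*)\)\s*;?\s*$", body, re.DOTALL)
    if not match:
        raise ValueError("Unexpected gviz response format")
    return json.loads(match.group(1))
```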
Best Practices
- Use realistic User-Agent strings that mimic actual browsers to avoid blocking.
- Implement a 7-day lookback when fetching daily files to handle weekends and holidays.
- Set appropriate timeouts (10s for small files, 15s for large equity lists).
- Clean whitespace from both column names and values after parsing CSV.
- Handle 404 gracefully - it's normal for files not to exist on certain dates.
- Use pandas for CSV parsing to handle irregular formatting and encoding issues.
- Validate the data structure before saving - ensure expected columns are present.
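The validation practice above can be sketched as a simple column check; the expected set here is the incremental price band format documented earlier:

```python
EXPECTED_BAND_COLUMNS = {"SYMBOL", "SERIES", "NEW_BAND", "OLD_BAND", "DATE"}

def validate_records(records, expected=EXPECTED_BAND_COLUMNS):
    """Return True only if records is non-empty and every record has the expected columns."""
    return bool(records) and all(expected <= set(record) for record in records)
```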
Common Issues
Files Not Found
Problem: Getting 404 for recent dates.
Solution: NSE publishes files after market close. Use 7-day lookback to find latest file.
Encoding Errors
Problem: CSV parsing fails with encoding errors.
Solution: Explicitly decode as UTF-8:
csv_content = response.content.decode('utf-8')
Blocked Requests
Problem: Getting 403 Forbidden or connection refused.
Solution: Ensure User-Agent header is set and mimics a real browser.
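One way to guarantee the header is always sent is a shared requests.Session carrying browser-like defaults (a sketch; the helper name is illustrative):

```python
import requests

def make_session():
    """Create a session whose default headers mimic a real browser."""
    session = requests.Session()
    session.headers.update({
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
        "Accept": "*/*",
    })
    return session
```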
Malformed CSV
Problem: Pandas fails to parse CSV.
Solution: Use try-except and skip to next date. NSE occasionally has formatting issues in their files.
Data Freshness
NSE publishes files on a daily basis:
| File Type | Update Frequency | Typical Publish Time |
|---|---|---|
| Incremental Price Bands | As needed (not daily) | After market close |
| Complete Security List | Daily | After market close (~6 PM IST) |
| ASM/GSM Lists | Weekly | Monday mornings |
Files for the current day may not be available until after market close (3:30 PM IST) and processing time (~2-3 hours). The 7-day lookback ensures you always get the most recent data.
Integration with Pipeline
NSE data integrates with the main pipeline:
- Price bands are injected into the final JSON via
add_corporate_events.py
- ASM/GSM lists trigger Event Markers (★) in the output
- ISIN mapping from security list validates master ISIN map
# Event marker logic in add_corporate_events.py:
if symbol in asm_list:
    event_markers.append("★: LTASM")
if symbol in gsm_list:
    event_markers.append("★: GSM")